


Invited Talks
Invited Talk
Björn Ommer

[ Hall E (level 1) ]

Abstract
Invited Talk
Jelani Nelson

[ Hall E (level 1) ]

Abstract

'Sketches' of data are memory-compressed summarizations that still allow answering useful queries, and as a tool they have found use in algorithm design, optimization, machine learning, and more. This talk will give an overview of some core sketching tools and how they work, including recent advances. We also discuss a couple of newly active areas of research, such as augmenting sketching algorithms with learned oracles in a way that provides provably enhanced performance guarantees, and designing robust sketches that maintain correctness even in the face of adaptive adversaries.
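To make the idea concrete, here is a minimal sketch of one classic example of such a structure, the Count-Min sketch, which answers approximate frequency queries from a small fixed-size table of counters. This is my own illustration of the general idea, not code or an example taken from the talk; the width and depth parameters, and the use of Python's built-in hash in place of a pairwise-independent hash family, are simplifications.

```python
# Minimal Count-Min sketch: an illustrative example of a data sketch.
import random

class CountMinSketch:
    def __init__(self, width=2048, depth=4, seed=0):
        rng = random.Random(seed)
        self.width = width
        self.salts = [rng.getrandbits(64) for _ in range(depth)]
        self.tables = [[0] * width for _ in range(depth)]

    def _buckets(self, item):
        for row, salt in enumerate(self.salts):
            yield row, hash((salt, item)) % self.width

    def add(self, item, count=1):
        for row, col in self._buckets(item):
            self.tables[row][col] += count

    def estimate(self, item):
        # Collisions can only inflate counters, so the minimum over rows
        # is an upper bound on the true count (never an underestimate).
        return min(self.tables[row][col] for row, col in self._buckets(item))

sketch = CountMinSketch()
for word in ["the", "quick", "the", "fox", "the"]:
    sketch.add(word)
print(sketch.estimate("the"))  # 3 (or larger if hash collisions occur)
```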

Invited Talk
Susan Murphy

[ Hall F (level 1) ]

Abstract

In this talk I will discuss first solutions to some of the challenges we face in developing online RL algorithms for use in digital health interventions targeting patients struggling with health problems such as substance misuse, hypertension, and bone marrow transplantation. Digital health raises a number of challenges for the RL community, including different sets of actions, each set intended to impact patients over a different time scale; the need to learn both within an implementation and between implementations of the RL algorithm; noisy environments; and a lack of mechanistic models. In all of these settings the online algorithm must be stable and autonomous. Despite these challenges, RL can be successful with careful initialization, careful management of the bias/variance tradeoff, and close collaboration with health scientists. We can make an impact!

Invited Talk
Christopher Ré

[ Hall E (level 1) ]

Abstract

I'm a simple creature. I fell in love with foundation models (FMs) because they radically improved data systems that I had been trying to build for a decade, and they are just awesome! This talk starts with my perspective on how FMs change the systems we build, focusing on what I call "death by a thousand cuts" problems. Roughly, these are problems in which each individual task looks easy, but the sheer variety and breadth of tasks make them hard.

The bulk of the talk is about understanding how to efficiently build foundation models. We describe trends in hardware accelerators from a perhaps unexpected viewpoint: database systems research. Databases have worried about optimizing IO – reads and writes within the memory hierarchy – since the 80s. In fact, optimizing IO led to Flash Attention for Transformers.
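For a concrete sense of what "optimizing IO" means here, the following NumPy sketch (my own illustration, not code from the talk or from the FlashAttention implementation) computes attention over key/value blocks with a running softmax, so the full attention matrix is never materialized; the block size and shapes are arbitrary.

```python
# Illustrative only: blocked attention with an online softmax, in the spirit
# of IO-aware kernels, processing keys/values one block at a time.
import numpy as np

def blocked_attention(q, k, v, block=64):
    """q: (n, d), k/v: (m, d). Processes k/v in blocks to bound working memory."""
    n, d = q.shape
    scale = 1.0 / np.sqrt(d)
    out = np.zeros((n, v.shape[1]))
    running_max = np.full(n, -np.inf)   # running max of logits per query
    running_sum = np.zeros(n)           # running softmax denominator per query

    for start in range(0, k.shape[0], block):
        kb, vb = k[start:start + block], v[start:start + block]
        logits = (q @ kb.T) * scale                      # (n, b)
        new_max = np.maximum(running_max, logits.max(axis=1))
        correction = np.exp(running_max - new_max)       # rescale old statistics
        p = np.exp(logits - new_max[:, None])            # (n, b)
        running_sum = running_sum * correction + p.sum(axis=1)
        out = out * correction[:, None] + p @ vb
        running_max = new_max

    return out / running_sum[:, None]

# Sanity check against the naive formula on random data.
rng = np.random.default_rng(0)
q, k, v = rng.normal(size=(8, 16)), rng.normal(size=(100, 16)), rng.normal(size=(100, 16))
naive = np.exp(q @ k.T / np.sqrt(16))
naive = (naive / naive.sum(axis=1, keepdims=True)) @ v
assert np.allclose(blocked_attention(q, k, v), naive)
```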

But are there more efficient architectures for foundation models than the Transformer? Maybe! I'll describe a new class of architectures based on classical signal processing, exemplified by S4. These new architectures are asymptotically more efficient than Transformers for long sequences, have achieved state-of-the-art quality on benchmarks like Long Range Arena, and have been applied to images, text, DNA, audio, and video. S4 will allow us to …
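As a rough illustration of the building block behind such architectures (a simplified toy of my own, not the actual S4 parameterization: no HiPPO initialization, no structured state matrices, no learned discretization), a linear state-space layer applies the recurrence x_{k+1} = A x_k + B u_k with readout y_k = C x_k, which can equivalently be unrolled as a long convolution with the kernel K_k = C A^k B:

```python
# Toy linear state-space layer: a stripped-down sketch of the recurrence
# behind S4-style models; the matrices here are random, not learned.
import numpy as np

def ssm_scan(A, B, C, u):
    """Run y_k = C x_k with x_{k+1} = A x_k + B u_k over a 1-D input sequence u."""
    x = np.zeros(A.shape[0])
    ys = []
    for u_k in u:
        x = A @ x + B * u_k       # state update
        ys.append(C @ x)          # readout
    return np.array(ys)

rng = np.random.default_rng(0)
n = 4
A = 0.9 * np.eye(n) + 0.01 * rng.normal(size=(n, n))  # kept near-stable
B, C = rng.normal(size=n), rng.normal(size=n)
u = rng.normal(size=256)
y = ssm_scan(A, B, C, u)

# Equivalently, y is u convolved with the kernel K_k = C A^k B
# (computing this convolution efficiently is what S4-style models do).
K = np.array([C @ np.linalg.matrix_power(A, k) @ B for k in range(len(u))])
y_conv = np.array([K[:k + 1][::-1] @ u[:k + 1] for k in range(len(u))])
assert np.allclose(y, y_conv)
```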

Invited Talk
Linda Smith

[ Hall E (level 1) ]

Abstract

The world presents massive amounts of data for learning, but the data relevant to any one thing or event is sparse. I will present evidence from the egocentric experiences of infants and young children in their daily lives at home that demonstrates this sparsity, focusing on the case of early visual object recognition and object name learning. I will show how the statistics of infants' self-generated experiences present solutions to the problem: learner control and optimization of the input, a developmentally constrained curriculum of spatial and temporal properties of the input, and the coherence statistics of individual episodes of experience. I will present evidence with respect to both low-level visual statistics and higher-level semantic categories. I conclude with a discussion of the alliance between the neural mechanisms that generate the statistics at any point in development and the neural mechanisms that do the learning. I will discuss the implications of the findings for artificial intelligence, including studies using infant egocentric experiences as training data.

Invited Talk
Lora Aroyo

[ Hall E (level 1) ]

Abstract

Conventional machine learning paradigms often rely on binary distinctions between positive and negative examples, disregarding the nuanced subjectivity that permeates real-world tasks and content. This simplistic dichotomy has served us well so far, but because it obscures the inherent diversity in human perspectives and opinions, as well as the inherent ambiguity of content and tasks, it limits how well model performance can align with real-world expectations. This becomes even more critical when we study the impact and potential multifaceted risks associated with the adoption of emerging generative AI capabilities across different cultures and geographies. To address this, we argue that to achieve robust and responsible AI systems we need to shift our focus away from a single point of truth and weave a diversity of perspectives into the data used by AI systems, to ensure the trust, safety, and reliability of model outputs.

In this talk, I present a number of data-centric use cases that illustrate the inherent ambiguity of content and natural diversity of human perspectives that cause unavoidable disagreement that needs to be treated as signal and not noise. This leads to a call for action to establish culturally-aware and society-centered research on impacts of data quality and …

Invited Talk
Alexander Rush · Aakanksha Chowdhery · Angela Fan · Percy Liang · Jie Tang

[ Hall E (level 1) ]

Abstract